{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-07-24T14:46:25.538529Z",
     "iopub.status.busy": "2024-07-24T14:46:25.538280Z",
     "iopub.status.idle": "2024-07-24T14:46:25.545724Z",
     "shell.execute_reply": "2024-07-24T14:46:25.545041Z",
     "shell.execute_reply.started": "2024-07-24T14:46:25.538501Z"
    },
    "frozen": false,
    "init_cell": true,
    "tags": [
     "safe_output"
    ]
   },
   "source": [
    "# Autoloading\n",
    "\n",
    "Loading graph visualisation settings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "init_cell": true
   },
   "outputs": [],
   "source": [
    "%%capture \"Remove this line to see debug information\"\n",
    "%%graph_notebook_vis_options\n",
    "{\n",
    "  \"edges\": {\n",
    "    \"smooth\": {\n",
    "      \"enabled\": true,\n",
    "      \"type\": \"dynamic\"\n",
    "    },\n",
    "    \"arrows\": {\n",
    "      \"to\": {\n",
    "        \"enabled\": true,\n",
    "        \"type\": \"arrow\"\n",
    "      }\n",
    "    }\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "frozen": false,
    "tags": [
     "unsafe_output"
    ]
   },
   "source": [
    "# Initial Setup\n",
    "\n",
    "## Get a view of all Ingested Cluster\n",
    "\n",
    "Retrieve all the current cluster ingested in KubeHound with the associated runID with the number of nodes. This numbers can be used to get a clue of the size of the cluster and also identify if an ingestion did not complete."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-07-24T14:46:28.273187Z",
     "iopub.status.busy": "2024-07-24T14:46:28.272852Z",
     "iopub.status.idle": "2024-07-24T14:46:28.625859Z",
     "shell.execute_reply": "2024-07-24T14:46:28.625126Z",
     "shell.execute_reply.started": "2024-07-24T14:46:28.273156Z"
    },
    "frozen": false,
    "tags": [
     "safe_output"
    ]
   },
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "kh.nodes()\n",
    "    .groupCount()\n",
    "    .by(project('cluster','runID')\n",
    "         .by('cluster').by('runID'))\n",
    "    .unfold()\n",
    "    .limit(1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "frozen": false,
    "tags": [
     "unsafe_output"
    ]
   },
   "source": [
    "## Setting your run_id/cluster\n",
    "\n",
    "Set which runID you want to use. The variable are being shared with all users of the instance, so we advise to make a uniq string for your user `runID_yourid` to avoid any conflict."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "frozen": false,
    "tags": [
     "safe_output"
    ]
   },
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "graph.variables()\n",
    "    .set('runID_yourid','01htdgjj34mcmrrksw4bjy2e94')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "frozen": false,
    "tags": [
     "unsafe_output"
    ]
   },
   "source": [
    "# Endpoints\n",
    "\n",
    "Identify attack path from endpoints. Get a view of all endpoints leading to a critical path (full take over on the cluster).\n",
    "\n",
    "## Identify the vulnerable app/namespace\n",
    "\n",
    "The goal of this list is to identify endpoints leading to a critical path. The list here is exhaustive _by port_ which means we will deduplicate the result by the k8s label `app` or `namespace`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "frozen": false,
    "tags": [
     "safe_output"
    ]
   },
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "kh.endpoints()\n",
    "    .has(\"runID\", graph.variables().get('runID_yourid').get().trim())\n",
    "    .hasCriticalPath()\n",
    "    .dedup()\n",
    "    .by(\"namespace\")\n",
    "    .by(\"port\")\n",
    "    .valueMap(\"namespace\",\"app\",\"team\",\"portName\",\"port\",\"serviceDns\",\"exposure\")\n",
    "    .limit(100)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "frozen": false,
    "tags": [
     "unsafe_output"
    ]
   },
   "source": [
    "If the list above is still too big to handle you can start with a more narrow view. The following list give a more abstract view to get deduplicated list of vulnerable `app`/`namespace`.\n",
    "\n",
    "If the k8s label `app` is not set properly, scope it by `namespace`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "frozen": false,
    "tags": [
     "safe_output"
    ]
   },
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "kh.endpoints()\n",
    "    .has(\"runID\", graph.variables().get('runID_yourid').get().trim())\n",
    "    .hasCriticalPath()\n",
    "    .dedup()\n",
    "    .by(\"namespace\")\n",
    "    .by(\"app\")\n",
    "    .valueMap(\"namespace\",\"app\")\n",
    "    .limit(100)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The goal here is to extract a list of apps for which you accept the risk for XYZ reason, to ignore them in queries. You can set this exclude list of `app` or `namespace` using gremlin variables in the following cell:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "graph.variables()\n",
    "    .set('endpoits_whiteListedApp_yourid',['WHITELISTED_APP1', \"WHITELISTED_APP2\"])\n",
    "\n",
    "graph.variables()\n",
    "    .set('endpoits_whiteListedNamespace_yourid',['NAMESPACE1', \"NAMESPACE2\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "frozen": false,
    "tags": [
     "unsafe_output"
    ]
   },
   "source": [
    "## Manual investigation for each app/namespace\n",
    "\n",
    "From the above list, you can iterate manual investigation by scoping by each vulnerable `app`/`namespace`. To proceed with the investigation, just copy/paste the name of the vulnerable app (replace `VULNERABLE_APP` by the targetted app)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "graph.variables()\n",
    "    .set('endpoint_vulnApp_yourid','VULNERABLE_APP')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Listing all attack paths from a particular app\n",
    "\n",
    "The following gremlin request will **list all attack paths from the selected app**. We add a limit(1000) to avoid having huge graph."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "frozen": false,
    "scrolled": true,
    "tags": [
     "safe_output"
    ]
   },
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "kh.endpoints()\n",
    "    .has(\"runID\", graph.variables().get('runID_yourid').get().trim())\n",
    "    .has(\"app\",graph.variables().get('endpoint_vulnApp_yourid').get().trim())\n",
    "    .criticalPaths()\n",
    "    .by(elementMap())\n",
    "    .limit(1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "frozen": false,
    "tags": [
     "unsafe_output"
    ]
   },
   "source": [
    "The last view can already be quite overwhelming, even if it might not be an exhaustive view (as we capped the result with `limit(1000)`). Increasing the limit will not solve the issue as it will become humanly unreadable.\n",
    "\n",
    "### Listing all attack path deduplicated by app from a particular app \n",
    "\n",
    "One way to solve it is to generate an **overall view to understand the attack path**. This view will strip any specific information (image, ids, ...) and keep only 3 labels:\n",
    "* the `app` label which specify what is associated application\n",
    "* the `class` of the object (node, pod, role, ...) \n",
    "* if the resource is `critical`. \n",
    "\n",
    "For instance, this will remove any replicatset duplication."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "frozen": false,
    "tags": [
     "safe_output"
    ]
   },
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "kh.endpoints()\n",
    "    .has(\"runID\", graph.variables().get('runID_yourid').get().trim())\n",
    "    .has(\"app\",graph.variables().get('endpoint_vulnApp_yourid').get().trim())\n",
    "    .criticalPaths()\n",
    "    .by(valueMap(\"app\", \"class\",\"critical\").with(WithOptions.tokens,WithOptions.labels))\n",
    "    .limit(10000)\n",
    "    .dedup()\n",
    "    .limit(1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "frozen": false,
    "tags": [
     "unsafe_output"
    ]
   },
   "source": [
    "### Listing all attack path deduplicated by label/type from a particular app \n",
    "\n",
    "Sometimes, the previous view is still too big and return too many elements to be easily processable. So, to get an even widder picture, we can deduplicate the attack paths by k8s resource type only. This show the interaction from one type (endpoints/containers/nodes/...) to try to understand the bigger picture."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "frozen": false,
    "tags": [
     "safe_output"
    ]
   },
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "kh.endpoints()\n",
    "    .has(\"runID\", graph.variables().get('runID_yourid').get().trim())\n",
    "    .has(\"app\",graph.variables().get('endpoint_vulnApp_yourid').get().trim())\n",
    "    .criticalPaths()\n",
    "    .by(label())\n",
    "    .dedup()\n",
    "    .limit(1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Global view using the whitelisted approach\n",
    "\n",
    "We are reusing the same queries as previously but instead of iterating over each app, we take the problem more globaly. This approach can be quicker but needs to have a smaller or secure cluster.\n",
    "\n",
    "### Listing all attack paths (except the whitelisted one)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "kh.endpoints()\n",
    "    .has(\"runID\", graph.variables().get('runID_yourid').get().trim())\n",
    "    .not(\n",
    "        has(\"app\", within(graph.variables().get('endpoits_whiteListedApp_yourid').get()))\n",
    "        .or().has(\"namespace\", within(graph.variables().get('endpoits_whiteListedNamespace_yourid').get()))\n",
    "    )\n",
    "    .criticalPaths()\n",
    "    .by(elementMap())\n",
    "    .limit(1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To filter them out, add the following `.not(has(...whiteListedApp...).or(...whiteListedNamespace...)` block at the start of the Gremlin queries"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Listing all attack path deduplicated by app (except the whitelisted one)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "kh.endpoints()\n",
    "    .has(\"runID\", graph.variables().get('runID_yourid').get().trim())\n",
    "    .not(\n",
    "        has(\"app\", within(graph.variables().get('endpoits_whiteListedApp_yourid').get()))\n",
    "        .or().has(\"namespace\", within(graph.variables().get('endpoits_whiteListedNamespace_yourid').get()))\n",
    "    )\n",
    "    .criticalPaths()\n",
    "    .by(valueMap(\"app\", \"class\",\"critical\").with(WithOptions.tokens,WithOptions.labels))\n",
    "    .limit(10000)\n",
    "    .dedup()\n",
    "    .limit(1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Listing all attack path deduplicated by label/type (except the whitelisted one)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%gremlin -d class -g critical -le 50 -p inv,oute\n",
    "\n",
    "kh.endpoints()\n",
    "    .has(\"runID\", graph.variables().get('runID_yourid').get().trim())\n",
    "    .not(\n",
    "        has(\"app\", within(graph.variables().get('endpoits_whiteListedApp_yourid').get()))\n",
    "        .or().has(\"namespace\", within(graph.variables().get('endpoits_whiteListedNamespace_yourid').get()))\n",
    "    )\n",
    "    .criticalPaths()\n",
    "    .by(label())\n",
    "    .dedup()\n",
    "    .limit(1000)"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Initialization Cell",
  "dd-sharing": {
   "allowed_groups": [
    "team-ase",
    "subproduct-secopsengineering",
    "team-aso",
    ""
   ],
   "allowed_users": [
    ""
   ],
   "retention_period": "90"
  },
  "kernelspec": {
   "display_name": "Python 3 (default)",
   "language": "python",
   "name": "ipykernel-default"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
